Accelerating Natural Gradient with Higher-Order Invariance
Authors
Abstract
An appealing property of the natural gradient is that it is invariant to arbitrary differentiable reparameterizations of the model. However, this invariance property requires infinitesimal steps and is lost in practical implementations with small but finite step sizes. In this paper, we study invariance properties from a combined perspective of Riemannian geometry and numerical differential equation solving. We define the order of invariance of a numerical method to be its convergence order to an invariant solution. We propose to use higher-order integrators and corrections based on geodesics to obtain more invariant optimization trajectories. We prove the numerical convergence properties of geodesic-corrected updates and show that they can be as computationally efficient as plain natural gradient. Experimentally, we demonstrate that invariance leads to faster training, and that our techniques improve on traditional natural gradient in optimizing synthetic objectives as well as deep classifiers and autoencoders.
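The loss of invariance under finite step sizes can be illustrated with a minimal sketch. This is a toy problem chosen for this note, not an experiment from the paper. It assumes a Bernoulli model fitted to a target probability Q by cross-entropy, written in two parameterizations: the mean p and the logit eta = log(p / (1 - p)). With Fisher information 1 / (p(1 - p)) in p and p(1 - p) in eta, the natural-gradient flow is dp/dt = -(p - Q) in one and d(eta)/dt = -(p - Q) / (p(1 - p)) in the other. Both describe the same continuous-time trajectory. The names `flow_p`, `flow_eta` and `invariance_gap`, and the values of `Q`, `p0` and the step sizes, are illustrative assumptions. The midpoint rule stands in here as one example of a higher-order integrator; the geodesic correction is not implemented.

```python
import math

Q = 0.8  # target Bernoulli probability (illustrative assumption)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def flow_p(p):
    # Natural-gradient field in the mean parameter p:
    # F(p) = 1/(p(1-p)), L'(p) = (p-Q)/(p(1-p))  =>  -L'/F = -(p - Q)
    return -(p - Q)


def flow_eta(eta):
    # The same field in the logit parameter eta:
    # F(eta) = p(1-p), L'(eta) = p - Q  =>  -L'/F = -(p - Q)/(p(1-p))
    p = sigmoid(eta)
    return -(p - Q) / (p * (1.0 - p))


def euler(f, x, h, n):
    # First-order integrator: the plain natural-gradient update with step h.
    for _ in range(n):
        x = x + h * f(x)
    return x


def midpoint(f, x, h, n):
    # Second-order integrator: evaluate the field at a half step first.
    for _ in range(n):
        x_half = x + 0.5 * h * f(x)
        x = x + h * f(x_half)
    return x


def invariance_gap(integrator, h, total_time=1.0, p0=0.2):
    # Run the same integrator in both parameterizations for the same total
    # time and compare the end points after mapping eta back to p.
    n = round(total_time / h)
    p_direct = integrator(flow_p, p0, h, n)
    eta0 = math.log(p0 / (1.0 - p0))
    p_via_eta = sigmoid(integrator(flow_eta, eta0, h, n))
    return abs(p_direct - p_via_eta)


if __name__ == "__main__":
    for h in (0.1, 0.05, 0.025):
        print(h, invariance_gap(euler, h), invariance_gap(midpoint, h))
```

The gap would be zero for an exactly invariant update. In this sketch, the Euler gap is expected to shrink roughly in proportion to the step size, and the midpoint gap roughly in proportion to its square, which is the sense in which a higher-order integrator is "more invariant".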
Similar resources
CLIC 30 GHz Accelerating Structure Development (European Organization for Nuclear Research, CERN PS Division, CERN/PS 2002-059 (RF), CLIC Note 532)
The main effects which limit accelerating gradient in CLIC (Compact Linear Collider) main linac accelerating structures are RF breakdown and pulsed surface heating. Recent highlights of the structure development program are presented, including demonstration of higher accelerating gradients using tungsten and a complete redesign of the CLIC main linac accelerating structure, based on reduced su...
Simulations of Currents in X-band Accelerator Structures Using 2D and 3D Particle-in-Cell Code
Accelerating gradient is one of the crucial parameters affecting design, construction and cost of next-generation linear accelerators. For a specified final energy, the gradient sets the accelerator length, and for a given accelerating structure and pulse repetition rate it determines power consumption. Accelerating gradients on the order of 100 MV/m have been reached in short ( 20 cm) standing...
Adaptive blind source separation by second order statistics and natural gradient
Separation of sources that are mixed by an unknown (hence, "blind") mixing matrix is an important task for a wide range of applications. This paper presents an adaptive blind source separation method using second order statistics (SOS) and natural gradient. The SOS of observed data is shown to be sufficient for separating mutually uncorrelated sources provided that the temporal coherences of al...
A Note on the Conformal Invariance of G-generalized Gradients
We consider generalized gradients in the general context of G-structures. They are natural first order differential operators acting on sections of vector bundles associated to irreducible G-representations. We study their geometric properties and show in particular their conformal invariance. 2000 Mathematics Subject Classification: Primary 53C10, 53A30, 58J60.
TeV Linear Collider Based on Conventional Technology
In order that it may be built within a reasonable length and with reasonable ac power consumption, a 5 TeV linear collider must employ an accelerating gradient and rf frequency which are both higher than for present 1 TeV collider designs. The required rf power per meter, which will also be higher than for 1 TeV designs, can be provided either by relatively conventional rf technology or by a tw...